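Beyond bibliometrics, there is interest in characterizing how the number of ideas in scientific papers evolves. A common approach to this question is to analyze publication titles and detect changes in vocabulary over time. On the premise that phrases, or more specifically keyphrases, represent concepts, lexical diversity metrics are applied to phrased versions of the titles. Changes in lexical diversity are then treated as indicators of research, and possibly of its expansion. Optimizing keyphrase detection is therefore an important part of this process. Rather than relying on a single phrase detection model, we propose using several, with the goal of producing a more comprehensive set of keyphrases from the source corpus. Another potential advantage of this approach is that the unions and differences of these sets may offer automated techniques for identifying and omitting non-specific phrases. We compare the performance of several phrase detection models, analyze the keyphrase set output by each, and use four common lexical diversity metrics to compute the lexical diversity of corpus variants that incorporate the keyphrases from each model.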
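The set operations and the diversity calculation described above can be sketched as follows. This is a minimal illustration, not the authors' pipeline: the abstract does not name its four metrics, so the type-token ratio is used here as an assumed stand-in, and the keyphrase sets, titles, and helper names (`phrase_title`, `type_token_ratio`) are all hypothetical.

```python
def phrase_title(title, keyphrases):
    """Join each known keyphrase into a single token (longest phrases first)."""
    phrased = title.lower()
    for phrase in sorted(keyphrases, key=len, reverse=True):
        phrased = phrased.replace(phrase, phrase.replace(" ", "_"))
    return phrased.split()


def type_token_ratio(tokens):
    """Type-token ratio: number of distinct tokens divided by total tokens."""
    return len(set(tokens)) / len(tokens) if tokens else 0.0


# Hypothetical keyphrase sets produced by two phrase detection models.
model_a = {"neural network", "lexical diversity", "this paper"}
model_b = {"neural network", "phrase detection", "this paper"}

combined = model_a | model_b   # union: the more comprehensive set
agreed = model_a & model_b     # phrases found by both models
only_a = model_a - model_b     # phrases found only by model A

# Hypothetical titles, phrased with the combined keyphrase set.
titles = [
    "Neural network models for phrase detection",
    "Phrase detection with a neural network",
]
tokens = [tok for title in titles for tok in phrase_title(title, combined)]
ttr = type_token_ratio(tokens)
```

Whether a phrase that every model agrees on (such as "this paper" here) is non-specific still has to be decided by some further criterion; the set operations only produce the candidates.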
Translated by Google Translate
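Radiomics uses quantitative medical imaging features to predict clinical outcomes. Currently, in a new clinical application, the optimal radiomics method must be found among the many available options manually, through a heuristic trial-and-error process. In this study, we propose a framework that automatically optimizes the construction of the radiomics workflow for each application. To this end, we formulate radiomics as a modular workflow and include a large collection of common algorithms for each component. To optimize the workflow for each application, we use automated machine learning with random search and ensembling. We evaluate our method in twelve different clinical applications, resulting in the following areas under the curve: 1) liposarcoma (0.83); 2) desmoid-type fibromatosis (0.82); 3) primary liver tumors (0.80); 4) gastrointestinal tumors (0.77); 5) colorectal liver metastases (0.61); 6) melanoma metastases (0.45); 7) hepatocellular carcinoma (0.75); 8) mesenteric fibrosis (0.80); 9) prostate cancer (0.72); 10) glioma (0.71); 11) Alzheimer's disease (0.87); and 12) head and neck cancer (0.84). We show that our framework performs competitively with human experts, outperforms a radiomics baseline, and performs similarly to or better than Bayesian optimization and more advanced ensembling approaches. In conclusion, our method fully automatically optimizes the construction of radiomics workflows, thereby streamlining the search for radiomics biomarkers in new applications. To facilitate reproducibility and future research, we publicly release six datasets, the software implementation of our framework, and the code to reproduce this study.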
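The combination of a modular workflow, random search, and ensembling can be sketched roughly as below. This is an illustrative sketch only and not the released framework: the component names in `SEARCH_SPACE`, the functions `sample_workflow`, `random_search`, and `ensemble_predict`, and the idea of passing in an `evaluate` callable are all assumptions made for the example.

```python
import random

# Hypothetical search space: one list of options per workflow component.
SEARCH_SPACE = {
    "feature_selection": ["none", "variance_threshold", "relief"],
    "scaling": ["z_score", "min_max"],
    "classifier": ["svm", "random_forest", "logistic_regression"],
}


def sample_workflow(rng):
    """Draw one workflow by picking a random option for every component."""
    return {step: rng.choice(options) for step, options in SEARCH_SPACE.items()}


def random_search(evaluate, n_iter, ensemble_size, seed=0):
    """Sample n_iter workflows and keep the best-scoring ones as an ensemble.

    `evaluate` is a caller-supplied function mapping a workflow to a
    validation score (higher is better), e.g. a cross-validated AUC.
    """
    rng = random.Random(seed)
    candidates = [sample_workflow(rng) for _ in range(n_iter)]
    ranked = sorted(candidates, key=evaluate, reverse=True)
    return ranked[:ensemble_size]


def ensemble_predict(member_probabilities):
    """Average the predicted probabilities of the ensemble members."""
    return sum(member_probabilities) / len(member_probabilities)
```

In practice `evaluate` would fit and validate each sampled workflow on the application's data; averaging the members' predicted probabilities is only one simple way to combine them.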
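In the present work, two machine-learning-based constitutive models for finite deformations are proposed. Using input convex neural networks, the models are hyperelastic, anisotropic, and fulfill the polyconvexity condition, which implies ellipticity and thus ensures material stability. The first constitutive model is based on a set of polyconvex, anisotropic, and objective invariants. The second approach is formulated in terms of the deformation gradient, its cofactor, and its determinant; it uses group symmetrization to fulfill the material symmetry condition and data augmentation to fulfill objectivity approximately. The extension of the dataset for the data augmentation approach is based on mechanical considerations and requires no additional experimental or simulation data. The models are calibrated with highly challenging simulation data of cubic lattice metamaterials, including finite deformations and lattice instabilities. A moderate amount of calibration data is used, based on deformations that are commonly applied in experimental investigations. While the invariant-based model shows drawbacks for several deformation modes, the model based on the deformation gradient alone reproduces and predicts the effective material behavior very well and exhibits excellent generalization capabilities. In addition, the models are calibrated with transversely isotropic data generated with an analytical polyconvex potential. In this case, both models show excellent results, demonstrating the straightforward applicability of the polyconvex neural network constitutive models to other symmetry groups.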
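For reference, the polyconvexity condition mentioned above is commonly stated as follows. The symbols $W$ (strain energy density), $\mathbf{F}$ (deformation gradient), and $\mathcal{P}$ are notation assumed here, since the abstract itself introduces none.

```latex
W(\mathbf{F}) = \mathcal{P}\bigl(\mathbf{F},\, \operatorname{cof}\mathbf{F},\, \det\mathbf{F}\bigr),
\qquad
\mathcal{P} \ \text{convex in}\ \bigl(\mathbf{F},\, \operatorname{cof}\mathbf{F},\, \det\mathbf{F}\bigr).
```

This is why an input convex neural network is a natural fit: if the network plays the role of $\mathcal{P}$ and is convex in its inputs, the resulting $W$ is polyconvex by construction.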
The performance of inertial navigation systems depends largely on a stable flow of external measurements and information to guarantee continuous filter updates and bound the drift of the inertial solution. Platforms in different operational environments may at some point be prevented from receiving external measurements, exposing their navigation solution to drift. Over the years, a wide variety of works have been proposed to overcome this shortcoming by exploiting knowledge of the system's current conditions and turning it into an applicable source of information for updating the navigation filter. This paper aims to provide an extensive survey of information-aided navigation, broadly classified into direct, indirect, and model aiding. Each approach is described by the notable works that implemented its concept, its use cases, the relevant state updates, and their corresponding measurement models. By matching the appropriate constraint to a given scenario, one can improve the accuracy of the navigation solution, compensate for the lost information, and uncover certain internal states that would otherwise remain unobservable.
The performance of deep learning (DL) models depends on the quality of labels. In some areas, the involvement of human annotators may introduce noise into the data. When these corrupted labels are blindly regarded as the ground truth (GT), DL models suffer from degraded performance. This paper presents a method that aims to learn a confident model in the presence of noisy labels, in conjunction with estimating the uncertainty of multiple annotators. We robustly estimate the predictions given only the noisy labels by adding an entropy- or information-based regularizer to the classifier network. We conduct our experiments on noisy versions of the MNIST, CIFAR-10, and FMNIST datasets. Our empirical results demonstrate the robustness of our method, as it outperforms or performs comparably to other state-of-the-art (SOTA) methods. In addition, we evaluate the proposed method on a curated dataset in which the noise type and level of the various annotators depend on the input image style. We show that our approach performs well and is adept at learning annotators' confusion. Moreover, we demonstrate that our model is more confident in predicting the GT than other baselines. Finally, we assess our approach on a segmentation problem and showcase its effectiveness with experiments.
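A regularizer of the kind described above can be sketched in a few lines. This is a minimal sketch under stated assumptions, not the paper's loss: the abstract does not give the exact form, so the example simply adds a weighted prediction-entropy term to the cross-entropy against the noisy label, leaves out the annotator-confusion estimation entirely, and uses hypothetical names (`regularized_loss`, `weight`).

```python
import math


def cross_entropy(probs, label):
    """Negative log-probability assigned to the (possibly noisy) label."""
    return -math.log(probs[label])


def entropy(probs):
    """Shannon entropy of a predicted distribution, in nats."""
    return -sum(p * math.log(p) for p in probs if p > 0)


def regularized_loss(probs, noisy_label, weight=0.1):
    """Cross-entropy plus a weighted entropy penalty (assumed form).

    Penalizing entropy pushes the classifier towards confident predictions;
    the sign and size of `weight` are illustrative choices only.
    """
    return cross_entropy(probs, noisy_label) + weight * entropy(probs)
```

With this form, a confident correct prediction such as `[0.9, 0.1]` for label 0 incurs a lower loss than an undecided `[0.5, 0.5]`, both through the cross-entropy term and through the entropy penalty.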
Landing an unmanned aerial vehicle (UAV) on top of an unmanned surface vehicle (USV) in harsh open waters is a challenging problem, because a severe roll and/or pitch angle of the USV during touchdown can produce forces that damage the UAV. To tackle this, we propose a novel model predictive control (MPC) approach that enables a UAV to land autonomously on a USV in these harsh conditions. The MPC employs a novel objective function and an online decomposition of the oscillatory motion of the vessel to predict, attempt, and accomplish the landing during near-zero tilt of the landing platform. The nonlinear prediction of the motion of the vessel is performed using visual data from an onboard camera. Therefore, the system does not require any communication with the USV or a control station. The proposed method was analyzed in numerous robotics simulations in harsh and extreme conditions and further validated in various real-world scenarios.
We develop theory and methods that use the graph Laplacian to analyze the geometry of the underlying manifold of point clouds. Our theory provides theoretical guarantees and explicit bounds on the functional form of the graph Laplacian, in the case when it acts on functions defined close to singularities of the underlying manifold. We also propose methods that can be used to estimate these geometric properties of the point cloud, which are based on the theoretical guarantees.
Nearly all jurisdictions in the United States require a professional license exam, commonly referred to as "the Bar Exam," as a precondition for law practice. To even sit for the exam, most jurisdictions require that an applicant complete at least seven years of post-secondary education, including three years at an accredited law school. In addition, most test-takers undergo weeks to months of further, exam-specific preparation. Despite this significant investment of time and capital, approximately one in five test-takers still score below the level required to pass the exam on their first try. In the face of a complex task that requires such depth of knowledge, what, then, should we expect of the state of the art in "AI?" In this research, we document our experimental evaluation of the performance of OpenAI's `text-davinci-003` model, often referred to as GPT-3.5, on the multistate multiple choice (MBE) section of the exam. While we find no benefit in fine-tuning over GPT-3.5's zero-shot performance at the scale of our training data, we do find that hyperparameter optimization and prompt engineering positively impacted GPT-3.5's zero-shot performance. For the best prompt and parameters, GPT-3.5 achieves a headline correct rate of 50.3% on a complete NCBE MBE practice exam, significantly in excess of the 25% baseline guessing rate, and performs at a passing rate for both Evidence and Torts. GPT-3.5's ranking of responses is also highly correlated with correctness; its top two and top three choices are correct 71% and 88% of the time, respectively, indicating very strong non-entailment performance. While our ability to interpret these results is limited by the nascent scientific understanding of LLMs and the proprietary nature of GPT, we believe that these results strongly suggest that an LLM will pass the MBE component of the Bar Exam in the near future.
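The top-two and top-three figures above are top-k correct rates over the model's ranked answer choices. The sketch below shows how such a rate can be computed; it is only an illustration with made-up rankings and answers, and the function name `top_k_rate` is hypothetical rather than taken from the study's code.

```python
def top_k_rate(rankings, answers, k):
    """Fraction of questions whose correct answer is among the top k choices.

    `rankings` holds, per question, the answer choices ordered from most to
    least preferred; `answers` holds the correct choice for each question.
    """
    hits = sum(answer in ranked[:k] for ranked, answer in zip(rankings, answers))
    return hits / len(answers)


# Made-up rankings for four four-option questions (illustrative only).
rankings = [
    ["A", "B", "C", "D"],
    ["C", "A", "B", "D"],
    ["B", "D", "A", "C"],
    ["D", "C", "B", "A"],
]
answers = ["A", "A", "A", "A"]

top1 = top_k_rate(rankings, answers, k=1)
top2 = top_k_rate(rankings, answers, k=2)
top3 = top_k_rate(rankings, answers, k=3)
```

On a four-option question, random ranking would give expected top-1, top-2, and top-3 rates of 25%, 50%, and 75%, which is the baseline against which the reported 50.3%, 71%, and 88% can be read.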
The future of population-based breast cancer screening is likely personalized strategies based on clinically relevant risk models. Mammography-based risk models should remain robust to domain shifts caused by different populations and mammographic devices. Modern risk models do not ensure adaptation across vendor-domains and are often conflated in that they unintentionally rely on both precursors of cancer and systemic/global mammographic information, associated with short- and long-term risk, respectively, which might limit performance. We developed a robust, cross-vendor model for long-term risk assessment. An augmentation-based domain adaptation technique, based on flavorization of mammographic views, ensured generalization to an unseen vendor-domain. We trained on samples without diagnosed/potential malignant findings to learn systemic/global breast tissue features, called mammographic texture, indicative of future breast cancer. However, training in this way may cause erratic convergence. By excluding noise-inducing samples and designing a case-control dataset, a robust ensemble texture model was trained. This model was validated in two independent datasets. In 66,607 Danish women with flavorized Siemens views, the AUC was 0.71 and 0.65 for prediction of interval cancers within two years (ICs) and from two years after screening (LTCs), respectively. In combination with established risk factors, the model's AUC increased to 0.68 for LTCs. In 25,706 Dutch women with Hologic-processed views, the AUCs were not different from the AUCs in Danish women with flavorized views. The results suggested that the model robustly estimated long-term risk while adapting to an unseen processed vendor-domain. The model identified 8.1% of Danish women accounting for 20.9% of ICs and 14.2% of LTCs.
Large language models (LLMs) have demonstrated impressive capabilities in natural language understanding and generation, but the quality bar for medical and clinical applications is high. Today, attempts to assess models' clinical knowledge typically rely on automated evaluations on limited benchmarks. There is no standard to evaluate model predictions and reasoning across a breadth of tasks. To address this, we present MultiMedQA, a benchmark combining six existing open question answering datasets spanning professional medical exams, research, and consumer queries; and HealthSearchQA, a new free-response dataset of medical questions searched online. We propose a framework for human evaluation of model answers along multiple axes including factuality, precision, possible harm, and bias. In addition, we evaluate PaLM (a 540-billion parameter LLM) and its instruction-tuned variant, Flan-PaLM, on MultiMedQA. Using a combination of prompting strategies, Flan-PaLM achieves state-of-the-art accuracy on every MultiMedQA multiple-choice dataset (MedQA, MedMCQA, PubMedQA, MMLU clinical topics), including 67.6% accuracy on MedQA (US Medical License Exam questions), surpassing prior state-of-the-art by over 17%. However, human evaluation reveals key gaps in Flan-PaLM responses. To resolve this, we introduce instruction prompt tuning, a parameter-efficient approach for aligning LLMs to new domains using a few exemplars. The resulting model, Med-PaLM, performs encouragingly, but remains inferior to clinicians. We show that comprehension, recall of knowledge, and medical reasoning improve with model scale and instruction prompt tuning, suggesting the potential utility of LLMs in medicine. Our human evaluations reveal important limitations of today's models, reinforcing the importance of both evaluation frameworks and method development in creating safe, helpful LLMs for clinical applications.